ABSTRACT


In this technical report, we propose that two simple behavioral measures, used in
conjunction with neurophysiological measures, can support a training intervention
model with the potential to provide (1) real-time notification of when a
training intervention is needed and (2) real-time information about the type of training
intervention that should be employed. The Cognitive Alignment with
Performance-Targeted Training Intervention Model (CAPTTIM) determines if a trainee's cognitive
state  is  aligned  or  misaligned  with  actual  performance.  When  misalignment  occurs,  it 
indicates that a training intervention is needed. Neurophysiological markers as captured
by eyetracking and electroencephalography (EEG) can assist in determining why
misalignment between cognitive state and performance occurred, leading to more
effective and targeted training interventions. Because all measures are captured
continuously  in  real  time,  this  model  has  the  potential  to  increase  training  efficiency  and 
effectiveness  in  a  variety  of  training  domains.  The  model  is  illustrated  with  two 
case  studies. 
EXECUTIVE SUMMARY


A.  MOTIVATION 

As  the  Army  focuses  on  enhancing  leader  development  and  decision  making  to 
improve  the  effectiveness  of  combat  forces,  the  importance  of  understanding  how  to 
effectively  train  decision  makers  and  how  experienced  decision  makers  arrive  at  optimal 
or near-optimal decisions has increased. Currently, there is little understanding of how
military decision makers arrive at optimal decisions, and the measurement of
decision-making performance lacks objectivity. The combined use of behavioral and
neurophysiological measures in human-in-the-loop wargames has the potential to fill this
knowledge gap and provide more objective measures of decision-making performance.

B.  PURPOSE 

This project’s purpose is to investigate the relationship between neurophysiological
indicators and optimal decision making in the context of military scenarios, as
represented in human-in-the-loop wargaming simulation experiments. We focused on
the  development  of  optimal  decision  making  when  all  subjects  begin  as  naive  decision 
makers.  Specifically,  we  attempted  to  identify  the  transition  from  exploring  the 
environment  as  a  naive  decision  maker  to  exploiting  the  environment  as  an  experienced 
decision  maker,  via  statistical  and  neurological  measures. 

C.  ARMY  RELEVANCY  AND  MILITARY  APPLICATION  AREAS 

Objectively  defining,  measuring,  and  developing  a  means  to  assess  military 
optimal  decision  making  has  the  potential  to  enhance  training  and  refine  procedures 
supporting  more  efficient  learning  and  task  accomplishment.  Through  the  application  of 
these  statistical  and  neurophysiological  models,  we  endeavor  to  further  neuromathematics 
and,  with  it,  advance  the  understanding  and  modeling  of  decision-making  processes  to 
more  deeply  comprehend  the  fundamentals  of  Soldier  cognition. 


D. SUMMARY OF CURRENT STATUS


We  developed  a  wargame  and  conducted  a  study  that  demonstrated  that  it 
successfully  elicits  cognitive  flexibility  and  reinforcement  learning.  Based  on 
quantitative  measures  of  exploration  and  exploitation,  we  developed  the  Cognitive 
Alignment with Performance-Targeted Training Intervention Model (CAPTTIM).
Based on real-time measures of a trainee’s cognitive state and their actual performance,
the model proposes a method for identifying (1) whether a trainee’s cognitive state
is aligned or misaligned with actual performance, and (2) possible reasons why
cognitive misalignment is occurring. We find that the combination of knowledge of
cognitive state and actual decision performance gives insight into the optimality of
trainees’ decisions.
I. INTRODUCTION


A.  OVERVIEW 

As  the  U.S.  Army  focuses  on  enhancing  leader  development  and  decision  making 
to  improve  the  effectiveness  of  its  combat  forces,  the  importance  of  understanding  how  to 
effectively  train  decision  makers  and  how  experienced  decision  makers  arrive  at  optimal 
or  near-optimal  decisions  has  increased  (Lopez,  2011).  In  order  to  understand  how  to 
effectively  train  decision  makers  to  make  optimal  decisions,  there  are  at  least  two 
components  that  need  to  be  understood  and  quantitatively  characterized.  One  such 
component is the cognitive state of the decision maker trainee: do they think they need to
learn more about the environment before they can make good decisions, or do they think
they are making good decisions? In our work, we call this first cognitive state
exploration: needing to learn about one’s environment and actively seeking and
responding to information in the environment. We refer to the latter state as exploitation:
thinking that you have figured out the task and acting on that knowledge.

A second component of understanding optimal military decision making is having
an objective measure of a trainee’s actual decision performance. Ideally, this measure
should provide, at any point during the task, information as to how close a trainee is to
making optimal decisions. It is important to note that both components, knowledge of the
decision maker’s cognitive state and a measure of their actual decision performance, are
necessary to truly understand optimal military decision making. In the process of
operationalizing the definitions of exploration and exploitation, and determining an
objective measure of decision performance, we developed the Cognitive Alignment with
Performance-Targeted Training Intervention Model (CAPTTIM). The purpose of this
paper is to describe the model and then to illustrate how the model works through two
case studies. We first describe how we operationalized exploration and exploitation, and
our measure of optimal decision performance.
B. OPERATIONALIZATION OF EXPLORATION AND EXPLOITATION
VIA THE MONITORING OF SEQUENTIAL SAMPLE VARIANCES


We  hypothesize  that  variability  in  latency  times  could  be  used  as  a  way  to 
operationally  define  the  cognitive  states  of  exploration  and  exploitation.  Specifically,  we 
expect  that  high  variability  in  latency  times  is  indicative  of  seeking,  responding,  and 
synthesizing  information  that  occurs  with  exploration,  whereas  low  variability  in  latency 
times  signifies  exploitation. 

One  method  for  monitoring  latency  variability  is  via  a  sequential  scheme,  where 
the  variance  of  a  latency  measure  is  repeatedly  estimated  from  moving  windows  of  data. 
Specifically, let $x_i$ denote the latency at time i, i = 2, 3, ..., 200. Then, for some window
of data of size w + 1, starting at time i = w + 2, sequentially calculate

$$s_i^2 = \frac{1}{w} \sum_{j=i-w}^{i} \left( x_j - \bar{x}_i \right)^2,$$

where

$$\bar{x}_i = \frac{1}{w+1} \sum_{j=i-w}^{i} x_j.$$

The idea is to monitor $s_{w+2}^2, s_{w+3}^2, s_{w+4}^2, \ldots$ and, when the sequence of sample variances is less
than some threshold h, we declare that the subject has gone from exploration to
exploitation.
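
The sequential scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names, the example latencies, and the values of w and h are all hypothetical. Windows are indexed from zero by their starting position rather than by trial number.

```python
from statistics import variance


def moving_variances(latencies, w):
    """Sample variance of each window of w + 1 consecutive latencies.

    Element k of the result is the variance of latencies[k : k + w + 1].
    statistics.variance divides by n - 1, which equals w for these windows.
    """
    if w < 1:
        raise ValueError("w must be at least 1")
    return [variance(latencies[k:k + w + 1])
            for k in range(len(latencies) - w)]


def first_exploitation_window(latencies, w, h):
    """Index of the first window whose sample variance is below h, or None."""
    for k, s2 in enumerate(moving_variances(latencies, w)):
        if s2 < h:
            return k
    return None


# Hypothetical latencies (seconds): erratic at first, then steady.
latencies = [4.0, 1.0, 6.0, 2.0, 5.0, 2.0, 2.1, 1.9, 2.0, 2.1, 2.0]
print(first_exploitation_window(latencies, w=3, h=0.05))
```

In this made-up series, the first window that falls below the threshold is the one starting at position 5, which is the first window made up only of the steady latencies.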

For  this  method,  one  question  is  how  to  choose  w.  There  are  two  considerations: 
(1)  ideally  w  +  1  should  be  smaller  than  the  smallest  length  of  time  that  a  subject  might 
be  in  exploration  mode  when  the  experiment  first  starts,  and  (2)  smaller  values  of  w  are 
better  in  the  sense  that  the  method  will  more  quickly  indicate  the  shift  to  exploitation,  but 
w+1  cannot  be  so  small  that  the  sample  standard  deviation  estimates  are  too  variable 
because  of  excess  noise.  Ultimately,  we  will  want  to  do  some  simulations  to  see  what  a 
good  choice  for  w  might  be.  Our  initial  guess  would  be  something  in  the  range  of 
5  <  w  <  20  or  so. 

A  second  question  is  how  to  choose  h.  The  planned  approach  will  be  to 
subjectively  compare  how  well  various  values  of  h  differentiate  between  exploration  and 
exploitation,  as  determined  by  various  other  external  measures,  such  as  those  from  the 
EEG, on a training set of data. The value of h that performs best will then be applied to
the  remaining  data. 

Finally, there is also a question of whether and how to detect if someone reverts
from exploitation back to exploration. One possibility is to continue to monitor the
sample variances and, once someone is in exploitation mode, should $s_i^2 > h$, conclude that
they have reverted to exploration. However, it may be that we need two thresholds,
call them $h_1$ and $h_2$, where $h_2 > h_1$, which would work as follows. Someone in
exploration mode switches to exploitation at time i when $s_i^2 < h_1$, while someone
in exploitation mode only switches to exploration at time i when $s_i^2 > h_2$. The key
idea here is that having two thresholds with some separation between them may decrease
inadvertent (i.e., excessive) switching back and forth between modes due to noise in
the data.
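
The two-threshold idea can be sketched as a small state machine. This is a hypothetical illustration: the function name, the threshold values, and the example variances are assumptions, and the sketch also assumes that every subject starts in exploration mode.

```python
def track_states(variances, h1, h2):
    """Label each window variance using two thresholds, with h2 > h1.

    A subject in exploration switches to exploitation when the variance
    drops below h1; a subject in exploitation switches back only when the
    variance rises above h2.
    """
    if not h2 > h1:
        raise ValueError("h2 must be greater than h1")
    state = "exploration"  # assumed starting state
    states = []
    for s2 in variances:
        if state == "exploration" and s2 < h1:
            state = "exploitation"
        elif state == "exploitation" and s2 > h2:
            state = "exploration"
        states.append(state)
    return states


# Hypothetical sequence of window variances.
variances = [3.0, 0.4, 0.8, 1.2, 0.9, 2.5, 0.3]
print(track_states(variances, h1=0.5, h2=2.0))
```

With these made-up numbers, the variances 0.8, 1.2, and 0.9 fall between the two thresholds and therefore do not trigger a switch. A single threshold placed at 1.0 would have flipped the state twice over the same three windows.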

C. MEASURE OF REGRET AS AN OBJECTIVE MEASURE OF DECISION
PERFORMANCE

Regret  provides  a  measure  of  deviations  from  the  ideal  decision  path,  at  any  given 
point  in  a  task.  Regret  is  the  difference  of  a  trainee’s  single  trial  outcome  and  the 
outcome  from  the  ideal  decision,  given  perfect  knowledge.  Less  regret  is  better;  on  any 
given  trial,  regret  can  be  zero  if  the  trainee  selects  the  best  decision.  More  generally, 
absolute  regret  compares  the  outcome  of  trainee  actions  to  the  outcome  generated  by 
playing the optimal policy at each of the n trials. Given $K \geq 2$ routes and sequences
$r_{i,1}, r_{i,2}, \ldots, r_{i,n}$ of unknown outcomes associated with each route $i = 1, \ldots, K$, at each trial
$t = 1, \ldots, n$, trainees select a route $I_t$ and receive the associated outcome $r_{I_t,t}$. Let $r^*_t$ be the
best possible outcome from any route on trial t (Auer & Ortner, 2010). The regret
after n plays $I_1, \ldots, I_n$ is defined by

$$R_n = \sum_{t=1}^{n} r^*_t - \sum_{t=1}^{n} r_{I_t,t}$$

Regret  provides  insights  in  the  aggregate  over  the  course  of  a  set  of  n  trials  (i.e.,  total 
regret)  and,  when  examined,  per  trial.  Regret  per  trial  provides  a  measure  of  a  trainee’s 
ability  to  identify  the  best  choice  available  at  a  given  point  in  time. 
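
As a minimal sketch of the regret calculation, the Python below computes regret per trial and total regret from a table of outcomes. It assumes that the best possible outcome on a trial is the largest outcome any route would have returned on that trial. The function names and the outcome values are hypothetical.

```python
def regret_per_trial(outcomes, chosen):
    """Per-trial regret: best available outcome minus the outcome received.

    outcomes[t][i] is the outcome route i would return on trial t.
    chosen[t] is the index of the route the trainee selected on trial t.
    """
    return [max(row) - row[i] for row, i in zip(outcomes, chosen)]


def total_regret(outcomes, chosen):
    """Total regret over all trials (the sum of the per-trial values)."""
    return sum(regret_per_trial(outcomes, chosen))


# Hypothetical net outcomes for 3 trials and 4 routes.
outcomes = [
    [50, -150, 100, 0],
    [-50, 75, 50, -1150],
    [100, 50, -200, 60],
]
chosen = [1, 1, 0]  # route indices picked by the trainee

print(regret_per_trial(outcomes, chosen))  # [250, 0, 0]
print(total_regret(outcomes, chosen))      # 250
```

Regret is zero on the second and third trials because the trainee picked the route with the best outcome on each of them.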
D. USE OF NEUROPHYSIOLOGICAL MEASURES TO PROVIDE
INSIGHTS INTO WHY NONOPTIMAL DECISION MAKING OCCURRED


Numerous  studies  indicate  that  eye-movement  data  via  eye-tracking  technology 
can  provide  valuable  insights  into  subjects’  attention  allocation  patterns  and  underlying 
cognitive  strategies  during  real-world  tasks  (Kasarskis,  Stehwien,  Hickox,  Aretz,  & 
Wickens,  2001;  Marshall,  2007;  Sullivan,  Yang,  Day,  &  Kennedy,  2011). 

E.  CAPTTIM 

Figure 1 outlines the main component of CAPTTIM: determining if a trainee’s
cognitive  state  is  aligned  or  misaligned  with  their  actual  performance.  When  cognitive 
state  is  misaligned  with  actual  performance,  it  indicates  that  a  training  intervention  is 
required.  As  illustrated  in  Figure  1,  a  trainee  typically  would  start  in  the  yellow  cell,  in 
which  they  are  in  exploration  mode  and  their  decision  performance  is  nonoptimal. 
Ideally,  at  some  point  during  the  task,  the  trainee  transitions  to  the  green  cell,  in  which 
they  are  in  exploitation  mode  and  their  decision  performance  is  optimal,  as  indicated  by 
low regret. When a trainee’s cognitive state is misaligned with actual decision
performance, training intervention should occur (orange and red cells). Given that
latency variance and regret can be measured in real time, the combination of these two
measures can be used as a simple, near-immediate indicator that a training intervention is needed.


                                        Cognitive State
Decision Performance   Exploration                         Exploitation

High Regret            Seeking information, and decision   Training intervention
                       performance is not optimal.         required.
                       Remaining in the yellow cell for
                       too long can be a concern.

Low Regret             Seeking information, yet decision   Acting upon acquired knowledge,
                       performance is optimal.             and decision performance is
                                                           optimal.

Figure 1. Illustration of the main components of CAPTTIM.
The model determines whether cognitive state, exploration or exploitation, is
aligned  or  misaligned  with  actual  decision  performance,  as  measured  by  regret.  The 
alignment  or  misalignment  is  an  indicator  of  the  quality  of  the  decisions  and  the  trainee’s 
mastery  of  the  task.  When  misalignment  occurs,  a  training  intervention  is  required. 
Misalignment  can  occur  for  several  reasons,  such  as  lack  of  focus  on  the  relevant 
information,  distraction,  sleepiness,  or  high  cognitive  workload. 
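
The alignment check can be sketched as a simple lookup from cognitive state and regret to a cell of Figure 1. This is an illustrative sketch only. The report does not specify a cutoff that separates low regret from high regret, so `regret_threshold` and the function names are hypothetical. The cell colors follow the descriptions in the text: yellow for exploration with nonoptimal performance, green for exploitation with optimal performance, and orange and red for the two misaligned cells.

```python
def capttim_cell(state, regret, regret_threshold):
    """Map a cognitive state and a regret value to a Figure 1 cell color.

    regret_threshold is an assumed cutoff: regret at or below it counts
    as low (optimal performance), regret above it counts as high.
    """
    if state not in ("exploration", "exploitation"):
        raise ValueError("state must be 'exploration' or 'exploitation'")
    low_regret = regret <= regret_threshold
    if state == "exploration":
        return "orange" if low_regret else "yellow"
    return "green" if low_regret else "red"


def needs_intervention(cell):
    """Misaligned cells (orange and red) call for a training intervention."""
    return cell in ("orange", "red")


# Hypothetical example: a trainee acting on a strategy that performs poorly.
cell = capttim_cell("exploitation", regret=400, regret_threshold=100)
print(cell, needs_intervention(cell))  # red True
```

Because both inputs are available on every trial, a check of this kind could run after each decision.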

Next,  the  incorporation  of  neurophysiological  measures,  such  as  eye  tracking  and 
electroencephalography  (EEG),  can  provide  an  understanding  as  to  why  a  trainee’s 
cognitive  state  and  actual  performance  are  misaligned  (see  Figure  2  and  Table  1). 
Understanding  why  misalignment  between  cognitive  state  and  decision  performance 
occurred  can  inform  the  type  of  training  intervention  that  should  be  done.  For  example, 
perhaps  a  trainee  is  in  the  red  cell  simply  because  they  are  not  attending  to  the  most 
relevant  pieces  of  information.  In  this  case,  an  attention  allocation  intervention  could  be 
employed.  A  trainee  in  the  orange  cell  may  be  experiencing  an  overly  high  cognitive 
workload  during  the  task  and  therefore  does  not  have  the  cognitive  capacity  to  realize  that 
they  are  performing  well.  In  this  case,  an  intervention  that  uses  very  strong  positive 
feedback  could  help  the  trainee  realize  that  they  actually  have  figured  out  the  task.  Thus, 
these  initial  results  suggest  that  highly  efficient  and  targeted  training  interventions  can 
occur with the combined use of decision performance, time to make a decision,
eye-tracking, and EEG information monitored in real time. In the next section, we illustrate
CAPTTIM  with  two  case  studies. 



Figure 2. Adapted from Land & Hayhoe (2001), this figure illustrates how
neurophysiological data can inform why nonoptimal decision making occurred.
Table 1. Outline of the secondary component of CAPTTIM: targeting the training
intervention. Included is a description of each type of nonoptimal decision-making error
and a corresponding possible training intervention.

Error level: Attention (Level 1 error)
Description: Information from eyetracking indicates that the person was not looking at
the salient information; therefore, optimal decision making is unlikely to occur.
Possible training intervention: Attention allocation that directs trainee’s gaze to the
salient information.

Error level: Perception (Level 2 error)
Description: Information from eyetracking indicates that the person glanced at the
salient information, but not long enough for it to register in the brain.
Possible training intervention: Attention allocation that directs trainee’s gaze to the
salient information.

Error level: Perception (Level 3 error)
Description: Information from eyetracking indicates that the person looked at the
salient information, and long enough for that information to register in the brain.
However, EEG data shows that the person is experiencing one or a combination of the
following: high cognitive workload, frequent distraction, or sleepiness.
Possible training intervention: Different training interventions depending on the EEG
data. High cognitive workload: restart the task at a lower level of difficulty.
Distraction: focus the trainee’s attention on the task; reduce distraction in the
surrounding area. Sleepiness: trainee should resume the task at a later time.

Error level: Decision (Level 4 error)
Description: This error occurs due to the person incorrectly using past experience or
preconceived notions in making their decisions. Information from eyetracking and EEG
rule out level 1-3 errors. The person is looking at the salient information and they are
not experiencing high cognitive workload, distraction, or sleepiness.
Possible training intervention: Increasingly stronger visual/audio cues to the trainee
that their current strategy is not optimal. Strong, immediate, positive feedback when
the trainee makes optimal decisions.
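
The error levels in Table 1 can be read as a sequence of checks, sketched below in Python. This is a hypothetical illustration. The report does not say how gaze dwell time or the EEG states would be thresholded, so the boolean inputs, the flag names, and the function name are all assumptions. The intervention strings paraphrase Table 1.

```python
ATTENTION_ALLOCATION = (
    "attention allocation that directs the trainee's gaze "
    "to the salient information"
)

# Assumed EEG flag names, mapped to the Level 3 interventions in Table 1.
EEG_INTERVENTIONS = {
    "workload": "restart the task at a lower level of difficulty",
    "distraction": "focus the trainee's attention on the task; "
                   "reduce distraction in the surrounding area",
    "sleepiness": "resume the task at a later time",
}


def classify_error(looked_at_salient, dwelled_long_enough, eeg_flags):
    """Return (error level, list of possible interventions) per Table 1."""
    if not looked_at_salient:
        return 1, [ATTENTION_ALLOCATION]
    if not dwelled_long_enough:
        return 2, [ATTENTION_ALLOCATION]
    actions = [EEG_INTERVENTIONS[flag] for flag in sorted(eeg_flags)
               if flag in EEG_INTERVENTIONS]
    if actions:
        return 3, actions
    # Levels 1-3 are ruled out, so the error is attributed to the decision.
    return 4, [
        "increasingly stronger visual/audio cues that the current "
        "strategy is not optimal",
        "strong, immediate, positive feedback on optimal decisions",
    ]


print(classify_error(True, True, {"sleepiness"}))
```

The checks run in order, so a Level 4 error is reported only after the eye-tracking and EEG inputs have ruled out Levels 1 through 3.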



F.  ILLUSTRATION  OF  CAPTTIM  WITH  CASE  STUDIES  FROM  THE 
CONVOY  TASK 


In  Kennedy,  Nesbitt,  and  Alt  (2014),  we  developed  and  tested  a  simple  wargame 
called  the  convoy  task  on  34  subjects,  all  of  whom  were  military  officers.  In  the  convoy 
task,  subjects  see  four  identical  roads  and  are  instructed  to  select  the  route  on  which  to 
send  their  convoy  (see  Figure  3).  Their  goal  is  to  have  the  highest  total  damage  score  by 
maximizing  the  damage  to  enemy  forces,  while  minimizing  the  friendly  damage  accrued 
over all trials. Through trial and error, subjects learn which routes have the best long-term
payoffs in damage. On each trial, the subject is provided immediate feedback in the
form of three separate pieces of information: a reward, a penalty, and a running total.
The  reward — the  number  of  enemy  forces  damaged — is  called  Enemy  Damage.  On  any 
given  trial,  enemy  damage  ranges  from  50  to  100  damage.  The  penalty — the  number  of 
friendly  forces  damaged — is  called  Friendly  Damage.  Depending  on  the  route  chosen, 
friendly  damage  ranges  from  0  to  -1,250  damage.  The  running  total  is  called  Total 
Damage,  defined  as  the  previous  trial’s  value  of  Total  Damage  plus  the  previous  trial’s 
Damage  to  Enemy  Forces  minus  the  previous  trial’s  Damage  to  Friendly  Forces.  The 
units  of  value  are  in  damage.  Subjects  begin  the  task  with  2,000  damage.  The  main 
outcome  variable  is  Total  Damage  at  the  end  of  the  200  trials.  A  subject  selects  routes 
until  the  end,  not  knowing  that  the  task  will  complete  after  200  trials.  The  assumption  is 
that  the  subject  maintains  some  estimate  of  the  value  similar  to  Accumulated  Damage  for 
each  route  and  updates  the  estimate  after  each  trial.  The  accuracy  of  the  estimate  will 
vary between subjects, as will the manner in which the subjects incorporate information
indexed  by  trial  into  their  estimate. 

Each  route  has  its  own  scripted,  ordered  set  of  specified  values.  For  example, 
every  subject  will  find  that  the  third  time  they  pick  route  1,  it  returns  +100  enemy 
damage  and  -150  friendly  damage.  Even  though  these  returns  by  route  are  set  and  are 
the  same  for  each  trainee,  the  games  will  progress  differently  due  to  the  divergence  of 
route  selection  between  subjects. 
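
The scripted payoff mechanism can be sketched as follows. This is a hypothetical reconstruction, not the software used in the study. The class name and all payoff values are made up, apart from the third pick of route 1 (+100 enemy damage, -150 friendly damage) and the starting total of 2,000, which come from the text. The sketch stores friendly damage as a negative number and updates the running total immediately after each selection. Both are simplifying assumptions.

```python
class ConvoyTask:
    """Toy version of the convoy task with scripted payoffs per route."""

    def __init__(self, scripts, start_total=2000):
        # scripts maps a route to an ordered list of
        # (enemy_damage, friendly_damage) pairs; friendly damage is <= 0.
        self.scripts = scripts
        self.picks = {route: 0 for route in scripts}
        self.total = start_total

    def select(self, route):
        """Return the next scripted payoff for a route and update the total."""
        enemy, friendly = self.scripts[route][self.picks[route]]
        self.picks[route] += 1
        self.total += enemy + friendly
        return enemy, friendly, self.total


# Hypothetical scripts for two of the four routes.
scripts = {
    1: [(50, 0), (100, -50), (100, -150)],
    2: [(100, -1250), (50, 0)],
}
task = ConvoyTask(scripts)
for route in (1, 2, 1, 1):
    print(route, task.select(route))
```

The payoff depends on how many times a route has been picked rather than on the trial number. Two subjects who pick routes in different orders therefore see different sequences of feedback, even though the scripts are the same.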
Figure 3. Screen shot of the convoy task in piloting; a typical subject’s view of the
task. We see that the trainee’s last choice caused 100 damage to the enemy (Damage to
Enemy Forces) and a loss of -250 to friendly forces (Damage to Friendly Forces),
resulting in a trial loss of -150 (not shown). The Accumulated Damage is 2,750. A
positive Accumulated Damage value is desirable to the trainee. Notice that the four routes
are represented by the same image.

G.  SEQUENTIAL  DETECTION  METHOD:  USING  LATENCY  DATA  TO 
DETERMINE  EXPLORATION  VS.  EXPLOITATION  COGNITIVE  STATES 

As  illustrated  in  Figures  4a  and  4b,  we  successfully  used  variability  in 
trial-by-trial  latency  time  to  detect  periods  of  exploration  and  exploitation  cognitive 
states. A single explore/exploit latency threshold was developed for each subject, derived
from two standard deviations above and below that subject’s latency times on trials with
0 or 50 friendly damage (i.e., the baseline latency time). Therefore, exploration was
defined as trials in which the latency time was at least two standard deviations (SD)
higher than the baseline latency time. Exploitation was defined as trials in which the
latency time was at least two SD lower than the
baseline latency time. Note that these definitions do not take into account actual decision
performance,  but  solely  the  subject’s  cognitive  state  at  a  given  time  in  the  task. 
Figures  4a  and  4b  depict  two  distinct  patterns  of  exploration  and  exploitation.  Figure  4a 
depicts an optimal exploration-to-exploitation transition (subject 14), whereas Figure 4b
illustrates  a  pattern  of  primarily  exploration  throughout  most  of  the  task  (subject  33). 
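
The per-subject threshold rule can be sketched as below. The text does not fully specify how the baseline is computed. This sketch therefore assumes that the baseline is the mean latency on the low-friendly-damage trials and that the SD is computed from those same trials. The function name, the "baseline" label for in-between trials, and the data are hypothetical.

```python
from statistics import mean, stdev


def classify_trials(latencies, baseline_latencies, n_sd=2.0):
    """Label each trial by its latency relative to a subject's baseline.

    baseline_latencies: latencies from trials assumed to define the
    baseline (here, trials with 0 or 50 friendly damage).
    """
    center = mean(baseline_latencies)
    spread = stdev(baseline_latencies)
    upper = center + n_sd * spread
    lower = center - n_sd * spread
    labels = []
    for x in latencies:
        if x >= upper:
            labels.append("exploration")
        elif x <= lower:
            labels.append("exploitation")
        else:
            labels.append("baseline")
    return labels


# Hypothetical latencies in seconds.
baseline = [2.0, 2.2, 1.8, 2.0]
print(classify_trials([3.0, 2.1, 1.5], baseline))
```

Unlike the moving-window variance scheme, this rule labels each trial on its own, so a single unusually slow or fast trial changes the label.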


Figures  4a  and  4b.  Use  of  sequential  sample  variances  in  latency  times  to  determine 
exploration  and  exploitation  cognitive  states.  Shaded  orange  regions  indicate  periods  of 
exploitation;  shaded  blue  regions  indicate  periods  of  exploitation. 

H.  COMBINING  SEQUENTIAL  DETECTION  METHODS  WITH  REGRET 

The combination of trial-by-trial information regarding the subject’s current
cognitive state (exploration or exploitation) with actual performance (measures of regret)
provides insight into whether a subject’s cognitive state is aligned with actual performance. Across
the 34 subjects who completed the convoy task, clear patterns of cognitive alignment and
misalignment are seen. We illustrate two of these patterns, exhibited by subjects 14 and
33, in Figures 5a and 5b. In these figures, we see that although subjects 14 and
33 show distinct differences in cognitive state, their cognitive state is aligned with their
measure  of  regret.  Subject  14  goes  through  a  period  of  exploration  until  about  trial  90,  at 
which  point  they  are  predominantly  in  exploitation  mode.  Consistent  with  this  cognitive 
state  pattern,  subject  14’s  regret  is  quite  high  until  about  trial  90,  at  which  point  it  begins 
to steeply decrease. Recall that lower regret means that the subject’s decisions are
converging toward the best possible decision. Thus, when subject 14’s cognitive state is in
exploration  mode,  their  regret  is  correspondingly  high.  When  their  cognitive  state 
transitions  to  exploitation,  their  regret  consistently  decreases.  In  contrast,  subject  33 
maintains  an  exploration  cognitive  state  throughout  most  of  the  task  and, 
correspondingly,  their  regret  is  consistently  high  throughout  the  task. 


[Figure 5a: subject 14. Figure 5b: subject 33 latency by trial, with regret/trial overlay. Legend: Exploration, Exploitation.]

Figures 5a and 5b. Illustration of the concordant pattern between each
subject’s cognitive state and their actual decision performance, as measured by regret, for
two different subjects. Regret across the 200 trials is denoted by the black line.
We then examined subject 33’s eye gaze and EEG data for indicators as to why
subject 33 showed a nonoptimal pattern and poor decision performance. As outlined in
Table 2, eyetracking data indicate that subject 33 had an eye gaze pattern similar to that of the
overall sample and that this subject was correctly focusing on friendly damage to a much
greater extent than total damage or enemy damage.


Table 2. Comparison of subject 33’s eye gaze pattern with that of the overall sample.

Gaze time (%)               Total Damage    Friendly Damage    Enemy Damage    Routes
Overall sample, mean (SD)   5.49 (12.47)    16.73 (14.87)      6.55 (6.40)     71.23 (19.86)
Subject 33                  2.90            13.96              7.78            75.26

Figure  6  illustrates  the  utility  of  combining  neurophysiological  and  behavioral 
measures.  Subject  33’s  EEG  data  indicates  that  there  were  several  periods  throughout  the 
task  when  they  experienced  high  cognitive  workload.  Note  that  the  peaks  in  latency  time 
in  the  first  several  trials,  and  between  approximately  trials  160  to  170,  overlap  and/or 
precede  peaks  in  periods  of  high  cognitive  workload.  This  subject,  however,  was  also 
frequently  distracted  and  was  minimally  engaged  in  the  task.  Given  insight  into  the 
subject’s  cognitive  state  throughout  the  task,  it  is  not  that  surprising  that  subject  33 
remained  in  an  exploration  state,  had  high  regret,  and  scored  700  in  total  damage,  which 
was  well  below  the  average  of  2,402.94. 
Figure 6. The proportion of time that subject 33 experienced sleepiness, distraction,
high engagement, or high cognitive workload on a given trial. Latency per trial is
depicted as the blue line.
II. SUMMARY


The  purpose  of  this  paper  was  to  use  case  studies  to  illustrate  CAPTTIM  and  its 
potential  impact  on  current  military  training.  CAPTTIM  uses  quantitative  statistical 
methods  and  objective  neurophysiological  measures  to  complete  the  following  actions  in 
real  time:  (1)  characterize  a  trainee’s  cognitive  state  as  either  exploration  or  exploitation, 
(2) determine whether cognitive state is aligned or misaligned with actual performance,
and (3) indicate ways in which the training intervention can be targeted to address why
cognitive misalignment occurred. Because latency times and decision performance
measures,  such  as  regret,  are  simple  behavioral  measures  that  easily  can  be  programmed 
into  training  software,  this  process  can  be  completed  in  real  time,  with  near-immediate 
notification  that  a  training  intervention  is  required.  Neurophysiological  measures,  such 
as  eyetracking  and  EEG,  also  are  measured  continuously  and  in  real  time,  suggesting  the 
potential  for  a  near-immediate,  targeted  training  intervention.  Because  of  these 
characteristics,  CAPTTIM  has  the  potential  to  improve  current  military  training 
efficiency  and  effectiveness. 